Results 1 - 20 of 256
1.
Article in English | MEDLINE | ID: mdl-38568767

ABSTRACT

Health disparities among marginalized populations with lower socioeconomic status significantly impact the fairness and effectiveness of healthcare delivery. The increasing integration of artificial intelligence (AI) into healthcare presents an opportunity to address these inequalities, provided that AI models are free from bias. This paper aims to address the bias introduced by population disparities within healthcare systems, which arises in both the presentation and the development of algorithms and leads to inequitable medical implementation for tasks such as pulmonary embolism (PE) prognosis. In this study, we explore the diversity of biases in healthcare systems, which highlights the need for a holistic framework to reduce bias by complementary aggregation. By leveraging de-biasing deep survival prediction models, we propose a framework that disentangles identifiable information from images, text reports, and clinical variables to mitigate potential biases within multimodal datasets. Our study offers several advantages over traditional clinical-based survival prediction methods, including richer survival-related characteristics and bias-complementary predicted results. By improving the robustness of survival analysis through this framework, we aim to benefit patients, clinicians, and researchers by improving fairness and accuracy in healthcare AI systems. The code is available at https://github.com/zzs95/fairPE-SA.

2.
Article in English | MEDLINE | ID: mdl-38587963

ABSTRACT

Despite providing high-performance solutions for computer vision tasks, the deep neural network (DNN) model has been shown to be extremely vulnerable to adversarial attacks. Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked. Moreover, commonly used adaptive learning and fine-tuning techniques are unsuitable for adversarial defense, since defense is essentially a zero-shot problem at deployment. To tackle this challenge, we propose an attack-agnostic defense method named Meta Invariance Defense (MID). Specifically, various combinations of adversarial attacks are randomly sampled from a manually constructed Attacker Pool to constitute different defense tasks against unknown attacks, in which a student encoder is supervised by multi-consistency distillation to learn the attack-invariant features via a meta principle. The proposed MID has two merits: 1) Full distillation at the pixel, feature, and prediction levels between benign and adversarial samples facilitates the discovery of attack-invariance. 2) The model simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration. Theoretical and empirical studies on numerous benchmarks such as ImageNet verify the generalizable robustness and superiority of MID under various attacks.

3.
Neural Netw ; 174: 106227, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38452663

ABSTRACT

Supervised learning-based image classification in computer vision relies on visual samples containing a large amount of labeled information. Considering that it is labor-intensive to collect and label images and construct datasets manually, Zero-Shot Learning (ZSL) achieves knowledge transfer from seen categories to unseen categories by mining auxiliary information, which reduces the dependence on labeled image samples and is one of the current research hotspots in computer vision. However, most ZSL methods fail to properly measure the relationships between classes, or do not consider the differences and similarities between classes at all. In this paper, we propose Adaptive Relation-Aware Network (ARAN), a novel ZSL approach that incorporates the improved triplet loss from deep metric learning into a VAE-based generative model, which helps to model inter-class and intra-class relationships for different classes in ZSL datasets and generate an arbitrary amount of high-quality visual features containing more discriminative information. Moreover, we validate the effectiveness and superior performance of our ARAN through experimental evaluations under ZSL and more practical GZSL settings on three popular datasets AWA2, CUB, and SUN.


Subjects
Knowledge
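
For readers unfamiliar with the triplet loss mentioned in the abstract above, the following is a minimal Python sketch of the standard margin-based form, max(0, d(a, p) - d(a, n) + margin). It is not ARAN's improved variant, which the abstract does not specify; the function names, the Euclidean distance, and the margin value are illustrative assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (assumed form, not the paper's improved loss).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy 2-D features: the positive shares the anchor's class, the negative does not.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]
negative = [3.0, 4.0]

loss_satisfied = triplet_loss(anchor, positive, negative)  # negative already far away
loss_violated = triplet_loss(anchor, negative, positive)   # roles swapped: large loss
```

Minimizing this quantity pulls same-class features together and pushes different-class features apart, which is the inter-class and intra-class modeling the abstract refers to.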
4.
Article in English | MEDLINE | ID: mdl-38536698

ABSTRACT

Face stylization has made notable progress in recent years. However, when training on limited data, the performance of existing approaches declines significantly. Although some studies have attempted to tackle this problem, they either fail to reach the few-shot setting (fewer than 10 samples) or obtain only suboptimal results. In this article, we propose GAN Prior Distillation (GPD) to enable effective few-shot face stylization. GPD contains two models: a teacher network with GAN Prior and a student network that performs end-to-end translation. Specifically, we adapt the teacher network, trained on large-scale data in the source domain, to the target domain using a handful of samples, so that it learns the target domain's knowledge. We then achieve few-shot augmentation by generating source-domain and target-domain images simultaneously with the same latent codes. We propose an anchor-based knowledge distillation module that fully uses the difference between the training data and the augmented data to distill the knowledge of the teacher network into the student network. By absorbing this additional knowledge, the trained student network achieves excellent generalization performance. Qualitative and quantitative experiments demonstrate that our method outperforms state-of-the-art approaches in the few-shot setting.

5.
Comput Biol Med ; 172: 108284, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38503086

ABSTRACT

3D MRI Brain Tumor Segmentation is of great significance in clinical diagnosis and treatment. Accurate segmentation results are critical for localization and spatial distribution of brain tumors using 3D MRI. However, most existing methods mainly focus on extracting global semantic features from the spatial and depth dimensions of a 3D volume, while ignoring voxel information, inter-layer connections, and detailed features. A 3D brain tumor segmentation network SDV-TUNet (Sparse Dynamic Volume TransUNet) based on an encoder-decoder architecture is proposed to achieve accurate segmentation by effectively combining voxel information, inter-layer feature connections, and intra-axis information. Volumetric data is fed into a 3D network consisting of extended depth modeling for dense prediction by using two modules: sparse dynamic (SD) encoder-decoder module and multi-level edge feature fusion (MEFF) module. The SD encoder-decoder module is utilized to extract global spatial semantic features for brain tumor segmentation, which employs multi-head self-attention and sparse dynamic adaptive fusion in a 3D extended shifted window strategy. In the encoding stage, dynamic perception of regional connections and multi-axis information interactions are realized through local tight correlations and long-range sparse correlations. The MEFF module achieves the fusion of multi-level local edge information in a layer-by-layer incremental manner and connects the fusion to the decoder module through skip connections to enhance the propagation ability of spatial edge information. The proposed method is applied to the BraTS2020 and BraTS2021 benchmarks, and the experimental results show its superior performance compared with state-of-the-art brain tumor segmentation methods. The source codes of the proposed method are available at https://github.com/SunMengw/SDV-TUNet.


Subjects
Brain Neoplasms; Humans; Brain Neoplasms/diagnostic imaging; Benchmarking; Neuroimaging; Semantics; Image Processing, Computer-Assisted
6.
IEEE Trans Image Process ; 33: 2419-2430, 2024.
Article in English | MEDLINE | ID: mdl-38517712

ABSTRACT

Due to the sparse single-frame annotations, current Single-Frame Temporal Action Localization (SF-TAL) methods generally employ threshold-based pseudo-label generation strategies. However, these approaches suffer from inefficient data utilization, as only the unlabeled frames whose confidence scores surpass a predefined threshold are selected for training. Moreover, the variability of single-frame annotations and unreliable model predictions introduce pseudo-label noise. To address these challenges, we propose two strategies that exploit the relationship between video segments and their neighbors: 1) temporal neighbor-guided soft pseudo-label generation (TNPG); and 2) semantic neighbor-guided pseudo-label refinement (SNPR). TNPG utilizes a local-global self-attention mechanism in a transformer encoder to capture temporal neighbor information while focusing on the whole video. The generated self-attention map is then multiplied by the network predictions to propagate information between labeled and unlabeled frames and produce soft pseudo-labels for all segments. Despite this, label noise persists due to unreliable model predictions. To mitigate this, SNPR refines pseudo-labels based on the assumption that the prediction for a segment should resemble those of its semantic nearest neighbors. Specifically, we search for the semantic nearest neighbors of each video segment by cosine similarity in the feature space. The refined soft pseudo-labels are then obtained as a weighted combination of the original pseudo-label and those of the semantic nearest neighbors. Finally, the model is trained with the refined pseudo-labels, which greatly improves performance. Comprehensive experiments show that the method achieves state-of-the-art performance on the THUMOS14, ActivityNet1.2, and ActivityNet1.3 datasets.
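
The refinement step described in this abstract (cosine-similarity nearest neighbors followed by a weighted combination of soft labels) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the blending weight `alpha`, the neighbor count `k`, the plain averaging of neighbor labels, and all function names are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def refine_pseudo_labels(features, labels, k=1, alpha=0.5):
    """Blend each segment's soft label with the mean label of its k nearest neighbors.

    `alpha` weights the original label and `1 - alpha` the neighbor average
    (both hypothetical). Requires at least two segments and k >= 1.
    """
    refined = []
    for i, feat in enumerate(features):
        # Rank every other segment by similarity to segment i, most similar first.
        ranked = sorted(
            ((cosine_similarity(feat, other), j)
             for j, other in enumerate(features) if j != i),
            reverse=True,
        )
        neighbors = [j for _, j in ranked[:k]]
        num_classes = len(labels[i])
        neighbor_mean = [
            sum(labels[j][c] for j in neighbors) / len(neighbors)
            for c in range(num_classes)
        ]
        refined.append([
            alpha * labels[i][c] + (1 - alpha) * neighbor_mean[c]
            for c in range(num_classes)
        ])
    return refined


# Three segments: the first two are semantically close, the third is not.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = [[0.9, 0.1], [0.4, 0.6], [0.1, 0.9]]  # noisy soft label on segment 1
refined = refine_pseudo_labels(features, labels, k=1, alpha=0.5)
```

In this toy case the noisy label of the second segment is pulled toward that of its nearest neighbor, the first segment, which is the intuition behind neighbor-guided refinement.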

7.
Asia Pac J Ophthalmol (Phila) ; 13(1): 100033, 2024.
Article in English | MEDLINE | ID: mdl-38383075

ABSTRACT

PURPOSE: To investigate the effectiveness and safety of phacogoniotomy versus phacotrabeculectomy (PVP) among patients with advanced primary angle-closure glaucoma (PACG) and cataracts. DESIGN: Multicenter, randomized controlled, non-inferiority trial. METHODS: A total of 124 patients (124 eyes) with advanced PACG and cataracts were enrolled, with 65 in the phacogoniotomy group and 59 in the phacotrabeculectomy group. Patients were followed up for 12 months with standardized evaluations. The primary outcome was the reduction in intraocular pressure (IOP) from baseline to 12 months postoperatively, evaluated against a non-inferiority margin of 4 mmHg. Secondary outcomes included the cumulative surgical success rate, postoperative complications, and changes in the number of glaucoma medications. RESULTS: After 12 months, phacogoniotomy demonstrated non-inferiority to phacotrabeculectomy in terms of IOP reduction, with mean IOP reductions of -26.1 mmHg and -25.7 mmHg (P = 0.383), respectively, from baseline values of around 40 mmHg. Both groups experienced a significant reduction in the mean number of medications used postoperatively (P < 0.001). The cumulative success rate was comparable between the groups (P = 0.890). However, phacogoniotomy had a lower rate of postoperative complications and interventions (12.3% and 4.6%) compared to phacotrabeculectomy (23.7% and 20.3%, respectively). The phacogoniotomy group reported shorter surgery time (22.1 ± 6.5 vs. 38.8 ± 11.1 min; P = 0.030) and higher quality of life (EQ-5D-5L) improvement at 12 months (7.0 ± 11.5 vs. 3.0 ± 12.9, P = 0.010) than the phacotrabeculectomy group. CONCLUSIONS: Phacogoniotomy was non-inferior to phacotrabeculectomy in terms of IOP reduction for advanced PACG and cataracts. Additionally, phacogoniotomy provided a shorter surgical time, lower postoperative complication rate, fewer postoperative interventions, and better postoperative quality of life.


Subjects
Cataract; Glaucoma, Angle-Closure; Phacoemulsification; Trabeculectomy; Humans; Cataract/complications; Glaucoma, Angle-Closure/complications; Glaucoma, Angle-Closure/surgery; Intraocular Pressure; Postoperative Complications/epidemiology; Quality of Life; Treatment Outcome
8.
Cell Rep ; 43(2): 113799, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38367239

ABSTRACT

Schlemm's canal (SC) functions to maintain proper intraocular pressure (IOP) by draining aqueous humor and has emerged as a promising therapeutic target for glaucoma, the second-leading cause of irreversible blindness worldwide. However, our current understanding of the mechanisms governing SC development and functionality remains limited. Here, we show that vitronectin (VTN) produced by limbal macrophages promotes SC formation and prevents intraocular hypertension by activating integrin αvβ3 signaling. Genetic inactivation of this signaling system inhibited the phosphorylation of AKT and FOXO1 and reduced β-catenin activity and FOXC2 expression, thereby causing impaired Prox1 expression and deteriorated SC morphogenesis. This ultimately led to increased IOP and glaucomatous optic neuropathy. Intriguingly, we found that aged SC displayed downregulated integrin β3 in association with dampened Prox1 expression. Conversely, FOXO1 inhibition rejuvenated the aged SC by inducing Prox1 expression and SC regrowth, highlighting a possible strategy of targeting VTN/integrin αvβ3 signaling to improve SC functionality.


Subjects
Glaucoma; Hypertension; Optic Nerve Diseases; Humans; Aged; Integrin alphaVbeta3; Schlemm's Canal; Macrophages
9.
Article in English | MEDLINE | ID: mdl-38417787

ABSTRACT

BACKGROUND: Preterm infants with low birthweight are at heightened risk of developmental sequelae, including neurological and cognitive dysfunction that can persist into adolescence or adulthood. In addition, preterm birth and low birthweight can provoke changes in endocrine and metabolic processes that likely impact brain health throughout development. However, few studies have examined associations among birthweight, pubertal endocrine processes, and long-term neurological and cognitive development. METHODS: We investigated the associations between birthweight and brain morphometry, cognitive function, and onset of adrenarche assessed 9 to 11 years later in 3571 preterm and full-term children using the Adolescent Brain Cognitive Development dataset. RESULTS: The preterm children showed lower birthweight and early adrenarche, as expected. Birthweight was positively associated with cognitive function (all |Cohen's d| > 0.154, P < 0.005), global brain volumes (all |Cohen's d| > 0.170, P < 0.008) and regional volumes in frontal, temporal, and parietal cortices in preterm and full-term children (all |Cohen's d| > 0.170, P < 0.0007); and cortical volume in the lateral orbitofrontal cortex (lOFC) partially mediated the effect of low birthweight on cognitive function in preterm children. In addition, adrenal score and cortical volume in the lOFC mediated the associations between birthweight and cognitive function only in preterm children. CONCLUSIONS: These findings highlight the impact of low birthweight on long-term brain structural and cognitive development, and show important associations with early onset of adrenarche during puberty. This understanding may help with prevention and treatment.

10.
IEEE Trans Image Process ; 33: 1508-1521, 2024.
Article in English | MEDLINE | ID: mdl-38363668

ABSTRACT

The key to multi-object tracking is its stability and the retention of identity information. A common problem with most detection-based approaches is that they trust and use all the detector outputs for the association. However, some detector settings can impair stable long-range tracking. Based on the principle of reducing the association noise in the detection processing step, we propose a new framework, the Box application Pattern Mining Tracker (BPMTrack), to address this issue. Specifically, we worked on three main aspects: output threshold, association strategy, and motion model. Because classification scores and localization accuracy can be inconsistent, we propose the Box Quality Estimation Network (BQENet) to predict the localization quality scores of all detections in the current frame, reserving high-quality boxes for the tracker. In addition, based on observations of crowded scenarios, we propose a simple and effective data association method, the Non-Maximum Suppression Integration (NMSI) matching strategy. It recovers the detections suppressed by Non-Maximum Suppression (NMS), feeds them into BQENet, and then performs hierarchical matching with reasonable control of box priority to alleviate the problem of absent objects caused by occlusion. Finally, we propose an improved Measurement Correct and Noise Scale (MCNS) Kalman algorithm to improve the prediction accuracy of object positions and, thus, the association quality. We performed an extensive ablation evaluation of the proposed framework to demonstrate its effectiveness. Moreover, results on three tracking benchmarks show our method's accuracy and long-distance performance.
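
As background for the NMSI strategy mentioned above, the sketch below shows plain greedy Non-Maximum Suppression in Python, returning both the kept boxes and the suppressed ones that a tracker could later revisit. It is a generic textbook procedure, not the BPMTrack implementation; the box format `(x1, y1, x2, y2)`, the IoU threshold, and the function names are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score, drop heavy overlaps.

    Returns (kept, suppressed) index lists so that suppressed boxes remain
    available for a later matching stage.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept, suppressed = [], []
    for i in order:
        if any(iou(boxes[i], boxes[j]) > iou_threshold for j in kept):
            suppressed.append(i)
        else:
            kept.append(i)
    return kept, suppressed


boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0)]
scores = [0.9, 0.8, 0.7]
kept, suppressed = nms(boxes, scores)
```

In crowded scenes a suppressed box may belong to a partially occluded object rather than being a duplicate, which is presumably why the abstract describes re-scoring such boxes instead of discarding them.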

11.
Article in English | MEDLINE | ID: mdl-38409281

ABSTRACT

Children with ADHD show abnormal brain function and structure. Neuroimaging studies have found that stimulant medications may improve brain structural abnormalities in children with ADHD. However, prior studies on this topic were conducted with relatively small sample sizes and wide age ranges and showed inconsistent results. In this cross-sectional study, we employed latent class analysis and linear mixed-effects models to estimate the impact of stimulant medications using demographic, clinical measures, and brain structure in a large and diverse sample of children aged 9-11 from the Adolescent Brain and Cognitive Development Study. We studied 273 children with low ADHD symptoms who received stimulant medication (Stim Low-ADHD), 1002 children with high ADHD symptoms who received no medication (No-Med ADHD), and 5378 typically developing controls (TDC). After controlling for the covariates, compared to Stim Low-ADHD and TDC, No-Med ADHD showed lower cortical thickness in the right insula (INS, d = 0.340, PFDR = 0.003) and subcortical volume in the left nucleus accumbens (NAc, d = 0.371, PFDR = 0.003), indicating that high ADHD symptoms were associated with structural abnormalities in these brain regions. In addition, there was no difference in brain structural measures between Stim Low-ADHD and TDC children, suggesting that the stimulant effects improved both ADHD symptoms and ADHD-associated brain structural abnormalities. Together, these findings suggest that children with ADHD appear to have structural abnormalities in brain regions associated with saliency and reward processing, and that treatment with stimulant medications not only improves ADHD symptoms but also normalizes these brain structural abnormalities.

12.
Article in English | MEDLINE | ID: mdl-38285580

ABSTRACT

Deep learning methods have achieved impressive performance in compressed video quality enhancement tasks. However, these methods rely excessively on practical experience, with manually designed network structures, and do not fully exploit the feature information contained in the video sequences: they neither take full advantage of the multiscale similarity of compression artifacts nor seriously consider the impact of partition boundaries in the compressed video on overall video quality. In this article, we propose a novel Mixed Difference Equation inspired Transformer (MDEformer) for compressed video quality enhancement, which provides a relatively reliable principle to guide the network design and yields new insight into interpretable transformers. Specifically, drawing on the graphical concept of the mixed difference equation (MDE), we utilize multiple cross-layer cross-attention aggregation (CCA) modules to establish long-range dependencies between the encoders and decoders of the transformer, where partition boundary smoothing (PBS) modules are inserted as feedforward networks. The CCA module makes full use of the multiscale similarity of compression artifacts to remove them effectively and to recover the texture and detail information of the frame. The PBS module leverages the sensitivity of smoothing convolution to partition boundaries to eliminate the impact of partition boundaries on the quality of compressed video and improve its overall quality, without much impact on non-boundary pixels. Extensive experiments on the MFQE 2.0 dataset demonstrate that the proposed MDEformer can eliminate compression artifacts to improve the quality of compressed video, and that it surpasses state-of-the-art (SOTA) methods in terms of both objective metrics and visual quality.

13.
IEEE Trans Image Process ; 33: 972-986, 2024.
Article in English | MEDLINE | ID: mdl-38241117

ABSTRACT

Because unlabeled data are abundant, there has been tremendous interest in developing unsupervised feature selection methods, among which graph-guided feature selection is one of the most representative techniques. However, the existing feature selection methods have the following limitations: (1) All of them only remove redundant features shared by all classes and neglect the class-specific properties; thus, the selected features cannot characterize the discriminative structure of the data well. (2) The existing methods only consider the relationship between the data and the corresponding neighbor points by Euclidean distance while neglecting the differences from other samples. Thus, existing methods cannot encode discriminative information well. (3) They adaptively learn the graph in the original or embedding space. Thus, the learned graph cannot characterize the data's cluster structure. To address these limitations, we present a novel unsupervised discriminative feature selection method via contrastive graph learning, which integrates feature selection and graph learning into a unified framework. Specifically, our model adaptively learns the affinity matrix, which helps characterize the data's intrinsic and cluster structures in the original space and the contrastive learning. We minimize ℓ1,2-norm regularization on the projection matrix to preserve class-specific features and remove redundant features shared by all classes. Thus, the selected features encode discriminative information well and characterize the discriminative structure of the data. Extensive experiments indicate that our proposed model has state-of-the-art performance.

14.
IEEE J Biomed Health Inform ; 28(2): 929-940, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37930923

ABSTRACT

Semi-supervised learning methods have been explored to mitigate the scarcity of pixel-level annotation in medical image segmentation tasks. Consistency learning, a mainstream method in semi-supervised training, suffers from low efficiency and poor stability due to inaccurate supervision and insufficient feature representation. Prototypical learning is one plausible way to handle this problem, owing to the feature aggregation inherent in prototype calculation. However, previous works have not fully studied how to enhance the supervision quality and feature representation using prototypical learning under the semi-supervised condition. To address this issue, we propose an implicit-explicit alignment (IEPAlign) framework to foster semi-supervised consistency training. Specifically, we develop an implicit prototype alignment method based on multiple prototypes computed dynamically on the fly. We then design a multiple prediction voting strategy for reliable unlabeled mask generation and prototype calculation to improve the supervision quality. Afterward, to boost the intra-class consistency and inter-class separability of pixel-wise features in semi-supervised segmentation, we construct a region-aware hierarchical prototype alignment, which transmits information from labeled to unlabeled data and from certain regions to uncertain regions. We evaluate IEPAlign on three medical image segmentation tasks. The extensive experimental results demonstrate that the proposed method outperforms other popular semi-supervised segmentation methods and achieves performance comparable to fully supervised training methods.


Subjects
Supervised Machine Learning; Image Processing, Computer-Assisted
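
The "feature aggregation in prototype calculation" mentioned in the abstract above can be illustrated with a small Python sketch: each class prototype is the mean of the feature vectors of the pixels carrying that label, and a new pixel is assigned to the class whose prototype it resembles most. This shows only the basic single-prototype idea, not IEPAlign's multiple dynamic prototypes or its alignment losses; the function names and the choice of cosine similarity are assumptions.

```python
import math


def class_prototypes(features, mask):
    """Average the feature vectors of all pixels that share a label.

    `features` is a list of per-pixel feature vectors and `mask` the matching
    list of integer class labels. Returns {label: prototype_vector}.
    """
    sums, counts = {}, {}
    for feat, label in zip(features, mask):
        acc = sums.setdefault(label, [0.0] * len(feat))
        for i, value in enumerate(feat):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def nearest_prototype(feat, prototypes):
    """Label of the prototype most similar to `feat`."""
    return max(prototypes, key=lambda label: cosine(feat, prototypes[label]))


# Four labeled pixels (two per class) and one unlabeled pixel to classify.
features = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
mask = [0, 0, 1, 1]
prototypes = class_prototypes(features, mask)
predicted = nearest_prototype([0.7, 0.3], prototypes)
```

Because a prototype averages many pixels, it tends to be less sensitive to individual noisy pseudo-labels than per-pixel supervision.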
15.
Psychol Med ; 54(2): 409-418, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37365781

ABSTRACT

BACKGROUND: Preterm birth is a global health problem and associated with increased risk of long-term developmental impairments, but findings on the adverse outcomes of prematurity have been inconsistent. METHODS: Data were obtained from the baseline session of the ongoing longitudinal Adolescent Brain and Cognitive Development (ABCD) Study. We identified 1706 preterm children and 1865 matched individuals as Control group and compared brain structure (MRI data), cognitive function and mental health symptoms. RESULTS: Results showed that preterm children had higher psychopathological risk and lower cognitive function scores compared to controls. Structural MRI analysis indicated that preterm children had higher cortical thickness in the medial orbitofrontal cortex, parahippocampal gyrus, temporal and occipital gyrus; smaller volumes in the temporal and parietal gyrus, cerebellum, insula and thalamus; and smaller fiber tract volumes in the fornix and parahippocampal-cingulum bundle. Partial correlation analyses showed that gestational age and birth weight were associated with ADHD symptoms, picvocab, flanker, reading, fluid cognition composite, crystallized cognition composite and total cognition composite scores, and measures of brain structure in regions involved with emotional regulation, attention and cognition. CONCLUSIONS: These findings suggest a complex interplay between psychopathological risk and cognitive deficits in preterm children that is associated with changes in regional brain volumes, cortical thickness, and structural connectivity among cortical and limbic brain regions critical for cognition and emotional well-being.


Subjects
Premature Birth; Child; Female; Adolescent; Infant, Newborn; Humans; Brain/pathology; Cognition/physiology; Infant, Premature; Longitudinal Studies; Magnetic Resonance Imaging/methods
16.
Article in English | MEDLINE | ID: mdl-38147424

ABSTRACT

Electroencephalography (EEG) and surface electromyography (sEMG) have been widely used in the rehabilitation training of motor function. However, EEG signals have poor user adaptability and low classification accuracy in practical applications, and sEMG signals are susceptible to abnormalities such as muscle fatigue and weakness, resulting in reduced stability. To improve the accuracy and stability of interactive training recognition systems, we propose a novel approach called the Attention Mechanism-based Multi-Scale Parallel Convolutional Network (AM-PCNet) for recognizing and decoding fused EEG and sEMG signals. Firstly, we design an experimental scheme for the synchronous collection of EEG and sEMG signals and propose an ERP-WTC analysis method for channel screening of EEG signals. Then, the AM-PCNet network is designed to extract the time-domain, frequency-domain, and mixed-domain information of the EEG and sEMG fusion spectrogram images, and the attention mechanism is introduced to extract more fine-grained multi-scale feature information of the EEG and sEMG signals. Experiments on datasets obtained in the laboratory have shown that the average accuracy of EEG and sEMG fusion decoding is 96.62%. The accuracy is significantly improved compared with the classification performance of single-mode signals. When the muscle fatigue level reaches 50% and 90%, the accuracy is 92.84% and 85.29%, respectively. This study indicates that using this model to fuse EEG and sEMG signals can improve the accuracy and stability of hand rehabilitation training for patients.


Subjects
Electroencephalography; Hand; Humans; Electromyography/methods; Electroencephalography/methods; Hand/physiology; Muscle Fatigue; Upper Extremity
17.
Article in English | MEDLINE | ID: mdl-38090852

ABSTRACT

Nowadays, data in the real world often come from multiple sources, but most existing multi-view K-Means methods perform poorly on linearly non-separable data and require initializing the cluster centers and calculating the mean, which makes the results unstable and sensitive to outliers. This paper proposes an efficient multi-view K-Means to solve these issues. Specifically, our model avoids initializing and computing the cluster centroids. Additionally, our model uses the Butterworth filter function to transform the adjacency matrix into a distance matrix, which makes the model capable of handling linearly inseparable data and insensitive to outliers. To exploit the consistency and complementarity across multiple views, our model constructs a third-order tensor composed of the discrete index matrices of different views and minimizes the tensor's rank via the tensor Schatten p-norm. Experiments on two artificial datasets verify the superiority of our model on linearly inseparable data, and experiments on several benchmark datasets illustrate its performance.

18.
Article in English | MEDLINE | ID: mdl-38153833

ABSTRACT

In minimally invasive surgery videos, label-free monocular laparoscopic depth estimation is challenging due to smoke. For this reason, we propose a self-supervised collaborative network-based depth estimation method with smoke removal for monocular endoscopic video, which is decomposed into two steps: smoke removal and depth estimation. In the first step, we develop a de-endoscopic smoke for cyclic GAN (DS-cGAN) to mitigate the smoke components at different concentrations. The designed generator network comprises a sharpened guide encoding module (SGEM), a residual dense bottleneck module (RDBM), and a refined upsampling convolution module (RUCM), and restores more detailed organ edges and tissue structures. In the second step, a high resolution residual U-Net (HRR-UNet) consisting of a DepthNet and two PoseNets is designed to improve the depth estimation accuracy, and adjacent frames are used for camera self-motion estimation. In particular, the proposed method requires neither manual labeling nor patient computed tomography scans during the training and inference phases. Experimental studies on the laparoscopic data set of the Hamlyn Centre show that our method can obtain accurate depth information after smoke removal in real surgical scenes while preserving the blood vessels, contours, and textures of the surgical site. The experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods in effectiveness and runs in real time at a frame rate of 94.45 fps, making it promising for clinical application.

19.
IEEE Trans Image Process ; 32: 6234-6247, 2023.
Article in English | MEDLINE | ID: mdl-37943636

ABSTRACT

Remarkable achievements have been obtained with binary neural networks (BNN) in real-time and energy-efficient single-image super-resolution (SISR) methods. However, existing approaches often adopt the Sign function to quantize image features while ignoring the influence of image spatial frequency. We argue that we can minimize the quantization error by considering different spatial frequency components. To achieve this, we propose a frequency-aware binarized network (FABNet) for single image super-resolution. First, we leverage the wavelet transformation to decompose the features into low-frequency and high-frequency components and then employ a "divide-and-conquer" strategy to separately process them with well-designed binary network structures. Additionally, we introduce a dynamic binarization process that incorporates learned-threshold binarization during forward propagation and dynamic approximation during backward propagation, effectively addressing the diverse spatial frequency information. Compared to existing methods, our approach is effective in reducing quantization error and recovering image textures. Extensive experiments conducted on four benchmark datasets demonstrate that the proposed methods could surpass state-of-the-art approaches in terms of PSNR and visual quality with significantly reduced computational costs. Our codes are available at https://github.com/xrjiang527/FABNet-PyTorch.
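
To make the binarization step in this abstract concrete, the sketch below contrasts the plain Sign function (threshold 0) with a shifted threshold, and pairs it with the straight-through estimator commonly used to pass gradients through the non-differentiable sign. It is a generic illustration: FABNet's learned thresholds and its "dynamic approximation" in the backward pass are not described in enough detail here to reproduce, so the threshold value, the clipping range, and the function names are all assumptions.

```python
def binarize(values, threshold=0.0):
    """Forward pass: map each activation to +1 or -1 around `threshold`.

    With threshold=0.0 this is the plain Sign function; a non-zero threshold
    stands in for a learned one (hypothetical value).
    """
    return [1.0 if v >= threshold else -1.0 for v in values]


def ste_gradient(values, upstream, threshold=0.0, clip=1.0):
    """Backward pass via the straight-through estimator (assumed surrogate).

    The upstream gradient is passed through unchanged where the activation
    lies within `clip` of the threshold and is zeroed elsewhere.
    """
    return [g if abs(v - threshold) <= clip else 0.0
            for v, g in zip(values, upstream)]


activations = [-1.5, -0.2, 0.3, 2.0]
sign_bits = binarize(activations)                    # plain Sign
shifted_bits = binarize(activations, threshold=0.5)  # shifted threshold
grads = ste_gradient(activations, [1.0, 1.0, 1.0, 1.0])
```

Shifting the threshold changes which activations map to +1, which is one simple way a binarizer could treat low-frequency and high-frequency feature components differently.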

20.
Article in English | MEDLINE | ID: mdl-37943647

ABSTRACT

Pawlak rough set (PRS) and neighborhood rough set (NRS) are the two most common rough set theoretical models. Although the PRS can use equivalence classes to represent knowledge, it is unable to process continuous data. On the other hand, NRSs, which can process continuous data, lose the ability to use equivalence classes to represent knowledge. To remedy this deficit, this article presents a granular-ball rough set (GBRS) based on granular-ball computing, combining the robustness and adaptability of granular-ball computing. The GBRS can simultaneously represent both the PRS and the NRS, enabling it both to handle continuous data and to use equivalence classes for knowledge representation. In addition, we propose an implementation algorithm of the GBRS by introducing the positive region of GBRS into the PRS framework. The experimental results on benchmark datasets demonstrate that the learning accuracy of the GBRS is significantly improved compared with the PRS and the traditional NRS. The GBRS also outperforms nine popular or state-of-the-art feature selection methods. We have open-sourced all the source code of this article at http://www.cquptshuyinxia.com/GBRS.html, https://github.com/syxiaa/GBRS.
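
For context on the Pawlak rough set notions used in this abstract, the following Python sketch computes equivalence classes over discrete attributes and the resulting positive region, that is, the objects whose equivalence class carries a single decision value. It covers only the classical PRS on categorical data, not the granular-ball construction; the toy table, attribute names, and function names are invented for illustration.

```python
from collections import defaultdict


def equivalence_classes(rows, attributes):
    """Group object indices that agree on every attribute in `attributes`."""
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[tuple(row[a] for a in attributes)].append(index)
    return list(groups.values())


def positive_region(rows, attributes, decisions):
    """Indices of objects whose equivalence class has a single decision value."""
    positive = set()
    for block in equivalence_classes(rows, attributes):
        if len({decisions[i] for i in block}) == 1:
            positive.update(block)
    return positive


# Toy decision table (hypothetical data): objects 0 and 1 are indiscernible
# on both attributes yet carry different decisions.
rows = [
    {"color": "red", "size": "big"},
    {"color": "red", "size": "big"},
    {"color": "blue", "size": "big"},
    {"color": "blue", "size": "small"},
]
decisions = ["yes", "no", "yes", "no"]

pos_both = positive_region(rows, ["color", "size"], decisions)
pos_size = positive_region(rows, ["size"], decisions)
```

The size of the positive region is a common criterion for rough-set feature selection: dropping `color` here shrinks the region, which suggests that the attribute carries discriminative information.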
